Address reduction blindly identifies non-random data series
Abstract
We introduce a method for detecting data series (curves) that exhibit a pattern, without knowing what kind of pattern they contain. By partitioning the space of curves into neighbourhoods, we show that the curves with the shortest addresses are the most likely to result from simple underlying mechanisms. We show that address reduction is a bound on Kolmogorov complexity and is invariant over noise and one-to-one transformations. We use it to blindly identify gene expression profiles in the yeast cell cycle and the segmentation clock, and to segregate human- and computer-generated random data.
Related resources
Dimensionality Reduction for Indexing Time Series Based on the Minimum Distance
We address the problem of efficient similarity search based on the minimum distance in large time series databases. To support minimum distance queries, most previous work requires a preprocessing step of vertical shifting. However, vertical shifting adds overhead when building the index. In this paper, we propose a novel dimensionality reduction technique for indexing time s...
On the Detection of Trends in Time Series of Functional Data
A sequence of functions (curves) collected over time is called a functional time series. Functional time series analysis is a popular research area, as such data are frequently observed in statistics. The main purpose of functional time series analysis is to predict and describe the random mechanisms that generated the data. To do so, it is necessary to decompose functional ti...
Modified Maximum Likelihood Estimation in First-Order Autoregressive Moving Average Models with some Non-Normal Residuals
When modeling time series data using autoregressive-moving average processes, it is common practice to presume that the residuals are normally distributed. However, we sometimes encounter non-normal residuals and asymmetry in the marginal distribution of the data. Despite widespread use of pure autoregressive processes for modeling non-normal time series, the autoregressive-moving average models have le...
Efficient Non-Oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee
In this paper, we address learning problems for high-dimensional data. Previously, oblivious random-projection-based approaches that project high-dimensional features onto a random subspace have been used in practice for tackling the high-dimensionality challenge in machine learning. Recently, various non-oblivious randomized reduction methods have been developed and deployed for solving many numeri...
1-D random landscapes and non-random data series
We study the simplest random landscape, the curve formed by joining consecutive data points f_1, ..., f_{N+1} with line segments, where the f_i are i.i.d. random numbers and f_i ≠ f_j. We label each segment increasing (+) or decreasing (−) and call this string of +'s and −'s the up-down signature σ. We calculate the probability P(σ(f)) for a random curve and use it to bound the algorithmic informa...
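The up-down signature described in this last abstract can be illustrated with a minimal Python sketch. The function name `up_down_signature` is hypothetical and not taken from the paper. The sketch assumes that no two consecutive values are equal, and it uses the ASCII character `-` in place of the typographic minus sign. It covers only the labelling step, not the probability calculation or the complexity bound.

```python
def up_down_signature(values):
    """Label each segment between consecutive points as '+' (increasing)
    or '-' (decreasing) and return the labels as one string.

    Hypothetical helper for illustration; assumes consecutive values differ.
    """
    if len(values) < 2:
        raise ValueError("need at least two data points to form a segment")
    labels = []
    for left, right in zip(values, values[1:]):
        if left == right:
            # A flat segment is neither increasing nor decreasing.
            raise ValueError("consecutive values must differ")
        labels.append("+" if right > left else "-")
    return "".join(labels)


if __name__ == "__main__":
    # Four points give three segments: up, down, up.
    print(up_down_signature([0.2, 0.7, 0.5, 0.9]))  # prints "+-+"
```

A series of N+1 points yields a signature of length N, one symbol per line segment.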
Publication date: 2006